Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models

Neural Information Processing Systems

Archimedes famously said, "Give me a lever long enough and a fulcrum on which to place it, and I shall move the world". In this study, we propose to use a tiny Language Model (LM), e.g., a Transformer with 67M parameters, to lever much larger Vision-Language Models (LVLMs) with 9B parameters. Specifically, we use this tiny Lever-LM to configure effective in-context demonstration (ICD) sequences to improve the In-Context Learning (ICL) performance of LVLMs. Previous studies show that diverse ICD configurations, such as the selection and ordering of the demonstrations, heavily affect ICL performance, highlighting the significance of configuring effective ICD sequences. Motivated by this, and by reconsidering the process of configuring an ICD sequence, we find that it mirrors human sentence composition, and we further assume that effective ICD configurations contain internal statistical patterns that can be captured by Lever-LM. A dataset of effective ICD sequences is then constructed to train Lever-LM. After training, given novel queries, new ICD sequences are configured by the trained Lever-LM to solve vision-language tasks through ICL. Experiments show that these ICD sequences improve the ICL performance of two LVLMs over strong baselines in Visual Question Answering and Image Captioning, validating that Lever-LM can indeed capture the statistical patterns for levering LVLMs.
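The abstract describes configuring an ICD sequence the way a language model composes a sentence: one demonstration at a time, conditioned on the choices so far. The sketch below is a hypothetical illustration of that idea, not the paper's actual method; `score` stands in for a trained Lever-LM forward pass, and the greedy selection loop is an assumption for illustration.

```python
def configure_icd_sequence(query, pool, score, length=4):
    """Greedily build an in-context demonstration (ICD) sequence.

    query  -- the novel test query to be solved via ICL
    pool   -- candidate demonstrations to choose from
    score  -- stand-in for a trained model: rates how good demonstration
              `d` is as the *next* ICD, given `query` and the sequence so far
    length -- number of demonstrations (shots) to select
    """
    sequence = []
    remaining = list(pool)
    for _ in range(min(length, len(remaining))):
        # pick the demonstration the scorer rates highest as the next step
        best = max(remaining, key=lambda d: score(query, sequence, d))
        sequence.append(best)
        remaining.remove(best)
    return sequence
```

With a toy scorer that prefers demonstrations of similar length to the query, `configure_icd_sequence("ab", ["aa", "bbbb", "c"], lambda q, s, d: -abs(len(d) - len(q)), length=2)` returns `["aa", "c"]`: ordering falls out of the step-by-step conditioning rather than being fixed in advance.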


When Researchers Say Mental Model/Theory of Mind of AI, What Are They Really Talking About?

Yin, Xiaoyun, Doost, Elmira Zahmat, Zhou, Shiwen, Yadav, Garima Arya, Gorman, Jamie C.

arXiv.org Artificial Intelligence

When researchers claim AI systems possess ToM or mental models, they are fundamentally discussing behavioral predictions and bias corrections rather than genuine mental states. Although LLMs achieve human-level performance on ToM laboratory tasks, these results reflect only behavioral mimicry. Humans develop theories to explain each other's behaviors (Sellars, 1956). This abstract concept is not tied to any specific parameters or rubric; it is a theory of an experienced mental state. Kosinski (2023) argues that individual artificial neurons in LLMs function like "Chinese rooms"; mechanically speaking, this comparison makes sense, and the key is to study how such components interact, since humans typically don't question others' cognition during everyday interaction. Meanwhile, Strachan et al. (2024) claim that LLM behavior is "indistinguishable from human" behavior on such tasks, and researchers like Gu et al. (2024) tested GPT-4 using their SimpleToM dataset.


Large Language Models as symbolic DNA of cultural dynamics

Pourdavood, Parham, Jacob, Michael, Deacon, Terrence

arXiv.org Artificial Intelligence

Although the recent wave of AI models, known as Large Language Models (LLMs), is seamlessly surpassing the Turing Test, this milestone has been overshadowed by their rapid commercialization and the profound ways they are already reshaping society. The pursuit of Artificial General Intelligence (AGI)--commonly defined as human-level intelligence--is touted as the next major milestone. Yet whether continued progress within the current framework could ever lead to agency and meaning at the scale of AI itself remains an open and contested question. Critics argue that current LLMs operate through algorithmic mimicry, that is, simulating intelligent behavior without embodying the principles behind it (Jaeger, 2024; Jaeger et al., 2024). Artificial Neural Networks--the main framework behind LLMs--operate on behaviorist assumptions: a framework that focuses exclusively on observable input-output patterns while treating internal states as part of a "black box" to be optimized (Brooks, 1991; Sutton & Barto, 2015). This does not mean LLMs lack sophisticated engineering, but their structure is designed to optimize internal states based on input-output feedback loops. Even though the logic behind behaviorism is likely one of the key principles supporting an intelligent system, it likely is not sufficient for intelligence and is not what enables agency and intelligence in the first place (Dreyfus, 1992; Searle, 1980). Furthermore, it would be naive to take intelligent-looking outward behavior as evidence of acquired intelligence or sentience, since a good simulation can be powerful and convincing. To address such issues, alternative approaches grounded in organismal intelligence are emerging to instead explain the principles behind intelligence through intrinsic and goal-directed models of the body and its relationship to the environment (Deacon, 2012; Jacob, 2023; Jaeger et al., 2024; Levin, 2019; Roli et al., 2022; Varela et al., 1993; Watson, 2024).


Statistical Patterns in the Equations of Physics and the Emergence of a Meta-Law of Nature

Constantin, Andrei, Bartlett, Deaglan, Desmond, Harry, Ferreira, Pedro G.

arXiv.org Artificial Intelligence

Physics, as a fundamental science, aims to understand the laws of Nature and describe them in mathematical equations. While the physical reality manifests itself in a wide range of phenomena with varying levels of complexity, the equations that describe them display certain statistical regularities and patterns, which we begin to explore here. By drawing inspiration from linguistics, where Zipf's law states that the frequency of any word in a large corpus of text is roughly inversely proportional to its rank in the frequency table, we investigate whether similar patterns for the distribution of operators emerge in the equations of physics. We analyse three corpora of formulae and find, using sophisticated implicit-likelihood methods, that the frequency of operators as a function of their rank in the frequency table is best described by an exponential law with a stable exponent, in contrast with Zipf's inverse power-law. Understanding the underlying reasons behind this statistical pattern may shed light on Nature's modus operandi or reveal recurrent patterns in physicists' attempts to formalise the laws of Nature. It may also provide crucial input for symbolic regression, potentially augmenting language models to generate symbolic models for physical phenomena. By pioneering the study of statistical regularities in the equations of physics, our results open the door for a meta-law of Nature, a (probabilistic) law that all physical laws obey.
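The contrast the abstract draws can be made concrete: under Zipf's inverse power law the frequency of the operator at rank r falls as r^(-s), whereas under the exponential law it falls as exp(-a*r). A simple diagnostic separates the two: the ratio of successive frequencies f(r+1)/f(r) is constant for the exponential law but drifts toward 1 for Zipf. The snippet below is an illustrative sketch of that distinction, not the paper's implicit-likelihood analysis; the exponent values are arbitrary.

```python
import math

def zipf_freq(rank, s=1.0):
    # Zipf's law: frequency proportional to 1 / rank**s
    return rank ** (-s)

def exp_freq(rank, a=0.4):
    # Exponential law: frequency proportional to exp(-a * rank)
    return math.exp(-a * rank)

def successive_ratio(freq, rank):
    # Ratio of neighboring frequencies: constant iff the law is exponential
    return freq(rank + 1) / freq(rank)
```

For the exponential law, `successive_ratio(exp_freq, r)` equals `exp(-a)` at every rank, while for Zipf it is `(r/(r+1))**s`, which rises toward 1 as r grows; this rank-dependence (or lack of it) is one way a stable exponential signature differs from a power law.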


How The ChatGPT Watermark Works And Why It Could Be Defeated

#artificialintelligence

OpenAI's ChatGPT introduced a way to create content automatically, but plans for a watermarking feature that would make such content easy to detect are making some people nervous. This is how ChatGPT watermarking works and why there may be a way to defeat it. ChatGPT is an incredible tool that online publishers, affiliates and SEOs simultaneously love and dread. Some marketers love it because they're discovering new ways to use it to generate content briefs, outlines and complex articles. Online publishers are afraid of the prospect of AI content flooding the search results, supplanting expert articles written by humans.


Large language models have a reasoning problem

#artificialintelligence

This article is part of our coverage of the latest in AI research. Even before the recent craze about sentient chatbots, large language models (LLMs) had been the source of much excitement and concern. In recent years, LLMs, deep learning models that have been trained on vast amounts of text, have shown remarkable performance on several benchmarks that are meant to measure language understanding. Large language models such as GPT-3 and LaMDA manage to maintain coherence over long stretches of text. They seem to be knowledgeable about different topics.


Chatbots: Still Dumb After All These Years

#artificialintelligence

In 1970, Marvin Minsky, recipient of the Turing Award ("the Nobel Prize of Computing"), predicted that within "three to eight years we will have a machine with the general intelligence of an average human being." The fundamental roadblock is that, although computer algorithms are really, really good at identifying statistical patterns, they have no way of knowing what these patterns mean because they are confined to MathWorld and never experience the real world. "It's a brown-throated thrush, but in Germany it's called a halsenflugel, and in Chinese they call it a chung ling, and even if you know all those names for it, you still know nothing about the bird – you only know something about people: what they call that bird. Now that thrush sings, and teaches its young to fly, and flies so many miles away during the summer across the country, and nobody knows how it finds its way," and so forth. There is a difference between the name of the thing and what goes on.


Meta-Learning of Compositional Task Distributions in Humans and Machines

Kumar, Sreejan, Dasgupta, Ishita, Cohen, Jonathan D., Daw, Nathaniel D., Griffiths, Thomas L.

arXiv.org Artificial Intelligence

Modern machine learning systems struggle with sample efficiency and are usually trained with enormous amounts of data for each task. This is in sharp contrast with humans, who often learn with very little data. In recent years, meta-learning, in which one trains on a family of tasks (i.e. a task distribution), has emerged as an approach to improving the sample complexity of machine learning systems and to closing the gap between human and machine learning. However, in this paper, we argue that current meta-learning approaches still differ significantly from human learning. We argue that humans learn over tasks by constructing compositional generative models and using these to generalize, whereas current meta-learning methods are biased toward the use of simpler statistical patterns. To highlight this difference, we construct a new meta-reinforcement learning task with a compositional task distribution. We also introduce a novel approach to constructing a "null task distribution" with the same statistical complexity as the compositional distribution but without explicit compositionality. We train a standard meta-learning agent, a recurrent network trained with model-free reinforcement learning, and compare it with human performance across the two task distributions. We find that humans do better in the compositional task distribution whereas the agent does better in the non-compositional null task distribution -- despite comparable statistical complexity. This work highlights a particular difference between human learning and current meta-learning models, introduces a task that displays this difference, and paves the way for future work on human-like meta-learning.


New AI Tool GPT-3 Ascends to New Peaks, But Proves How Far We Still Need to Travel

#artificialintelligence

If you want a glimpse of the future, check out how developers are using GPT-3. This natural language processor was trained on parameters ten times greater than its most sophisticated rival and can be used to answer questions and write astoundingly well. Creative professionals everywhere, from top coders to professional writers, marvel at what GPT-3 can produce even now – in its relative infancy. Yesterday, New York Times tech columnist Farhad Manjoo wrote that the short glimpse the general public has taken of GPT-3 "is at once amazing, spooky, humbling, and more than a little terrifying. GPT-3 is capable of generating entirely original, coherent, and sometimes even factual prose. And not just prose -- it can write poetry, dialogue, memes, computer code, and who knows what else." Manjoo speculated on whether a similar but more advanced AI might replace him someday.